Training a calligraphy style classifier on a non-representative training set
Abstract
Calligraphy collections are being scanned into document images for preservation and accessibility. The digitization technology is mature and calligraphy character recognition is well underway, but automatic calligraphy style classification is lagging. Special style features are developed to measure the style similarity of calligraphy character images that differ in stroke configuration and GB (or Unicode) label. Recognizing the five main styles is easiest when a style-labeled sample of the same character (i.e., same GB code) from the same work and scribe is available. Even samples of characters with different GB codes from the same work help. Style classification is most difficult when the training data has no comparable characters from the same work. These distinctions are quantified by distance statistics between the underlying feature distributions. Style classification is more accurate when several character samples from the same work are available. In adverse practical scenarios, when labeled versions of unknown works are not available for training the classifier, Borda Count voting and adaptive classification applied to the style-sensitive feature vectors of seven characters from the same work raise the ~70% single-sample baseline accuracy to ~90%.
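The Borda Count step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the votes are formed, so the following is an assumption: each of the seven character samples from one work yields a score per style, each sample ranks the styles by score, and the ranks are summed across samples. The function name `borda_count`, the style names in `STYLES`, and the score values are hypothetical and do not come from the paper.

```python
import numpy as np

# Assumed names for the five styles; the abstract only says "five main styles".
STYLES = ["seal", "clerical", "cursive", "running", "regular"]


def borda_count(scores):
    """Combine per-sample class scores by Borda Count.

    scores: array of shape (n_samples, n_classes); higher means more likely.
    Each sample gives a class as many points as the number of classes it
    outscores (0 for the lowest-scoring class, n_classes - 1 for the highest).
    Returns (index of the winning class, total points per class).
    """
    scores = np.asarray(scores, dtype=float)
    # Applying argsort twice converts scores to ranks within each row.
    ranks = scores.argsort(axis=1, kind="stable").argsort(axis=1, kind="stable")
    points = ranks.sum(axis=0)
    return int(points.argmax()), points


# Seven hypothetical samples from one work: four favour class 2, three favour class 0.
sample_scores = np.array(
    [[0.10, 0.20, 0.50, 0.15, 0.05]] * 4
    + [[0.50, 0.10, 0.30, 0.06, 0.04]] * 3
)
winner, points = borda_count(sample_scores)
print(STYLES[winner], points.tolist())
```

In this made-up example, three of the seven samples would be misclassified if each were decided alone, but the pooled ranking still selects a single style for the whole work, which is the kind of gain from multiple same-work samples that the abstract describes.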
Similar resources
Classifier Adaptation with Non-representative Training Data
We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine printed and handprinted digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand...
Calligraphy Style Correlation Discovery Based on Graph Model and Its Applications
As more and more works of calligraphy exist in digital libraries, traditional browsing and searching are no longer satisfactory. This paper presents an algorithm for calligraphy style correlation discovery based on a graph model. We first segment a calligraphy work into characters, extract their texture features through 64 Gabor channels, and estimate the calligraphy style using a probability multi-class SV...
Features Extraction of Arabic Calligraphy using extended Triangle Model for Digital Jawi Paleography Analysis
The style of writing or calligraphy applied in ancient manuscripts gives useful information to paleographers. The information helps paleographers identify the date, writer, number of writers, place of origin, and originality of manuscripts. This information is known as features. The features from characters, tangent value, dominant background and also Grey-Level Co-occurrence Matrix (GLCM) ha...
IRDDS: Instance reduction based on Distance-based decision surface
In instance-based learning, a training set is given to a classifier for classifying new instances. In practice, not all information in the training set is useful for classifiers. Therefore, it is convenient to discard irrelevant instances from the training set. This process is known as instance reduction, which is an important task for classifiers since through this process the time for classif...
The effects of Chinese calligraphy handwriting and relaxation training in Chinese Nasopharyngeal Carcinoma patients: a randomized controlled trial.
BACKGROUND: Chinese calligraphy handwriting is the practice of traditional Chinese brush writing. Research has found that calligraphy has therapeutic effects on certain diseases, and some authors have argued that calligraphy might have a relaxation effect. OBJECTIVES: This study compared the effects of calligraphy handwriting with those of progressive muscle relaxation and imagery training in Chinese Nasop...